feat(torch): expose optional codegen parameters #619
Generated source/header archive for review:
Additional compatibility validation for the latest commit (
Remaining skipped existing base headers in the overlay are schema/name compatibility warnings rather than build failures:
Rebased onto current Additional validation after rebase:
The rebase exposed a wrapper dispatch generation issue for overloads that reuse the same optional parameter names across scalar and Tensor variants (
Latest operator-bases overlay validation for
Remaining generation warnings are existing base/schema drift cases that are skipped rather than emitted as broken code:
Summary
- […] `src/base/<op>.h` overloads when available, forwarding omitted optional/default ATen parameters as typed defaults.
- […] `std::optional<T>` support to operator cache hashing and update the generated torch-op test harness for optional arguments and known vendor-specific PyTorch crashes/divergences.
Motivation
The PyTorch code generator previously hid optional ATen schema parameters and always forwarded typed `nullopt` values. That made generated APIs unable to exercise non-default optional behavior and caused drift against operator base headers that intentionally expose optional parameters. This PR makes optional schema handling explicit while keeping existing hand-written bases as the public API source of truth when they are present.
Closes # N/A: this is follow-up work from the PyTorch codegen/base drift discussion.
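As a rough illustration of what explicit optional handling could look like on the generator side, the sketch below parses simple ATen-style schema strings and separates the parameters that would be exposed from the optional ones that stay hidden. `SUPPORTED_OPTIONAL_TYPES`, `SchemaArg`, `parse_schema_args`, `public_name`, and `split_exposed_hidden` are hypothetical names, not the generator's actual API. The schema strings are written in the style of `native_functions.yaml` entries, and the parser assumes there are no commas inside types or defaults.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical: ATen base types assumed to have a stable public representation.
SUPPORTED_OPTIONAL_TYPES = {"Tensor", "Scalar", "int", "float", "bool"}


@dataclass(frozen=True)
class SchemaArg:
    name: str               # ATen schema name, e.g. "self"
    base_type: str          # schema type without the trailing "?"
    is_optional: bool       # True when the schema type is "T?"
    default: Optional[str]  # raw schema default text, or None if absent


def parse_schema_args(schema: str) -> List[SchemaArg]:
    """Parse the argument list of a simple ATen-style schema string."""
    start = schema.index("(") + 1
    end = schema.rindex(") ->")
    args = []
    for raw in schema[start:end].split(","):
        raw = raw.strip()
        if not raw or raw == "*":  # "*" only marks keyword-only arguments
            continue
        decl, _, default = raw.partition("=")
        type_str, name = decl.strip().rsplit(" ", 1)
        args.append(
            SchemaArg(
                name=name,
                base_type=type_str.rstrip("?"),
                is_optional=type_str.endswith("?"),
                default=default or None,
            )
        )
    return args


def public_name(schema_name: str) -> str:
    """Public C++ signatures use `input` where the ATen schema says `self`."""
    return "input" if schema_name == "self" else schema_name


def split_exposed_hidden(
    args: List[SchemaArg],
) -> Tuple[List[SchemaArg], List[SchemaArg]]:
    """Hide only optional arguments whose type has no supported representation."""
    exposed, hidden = [], []
    for arg in args:
        if arg.is_optional and arg.base_type not in SUPPORTED_OPTIONAL_TYPES:
            hidden.append(arg)
        else:
            exposed.append(arg)
    return exposed, hidden
```

Under these assumptions, a schema with `Scalar?` parameters keeps them in the exposed list, while a `Generator?` parameter lands in the hidden list and would be forwarded as a typed empty optional.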
Type of Change
- `feat`: new feature / new operator / new platform
- `fix`: bug fix.
- `perf`: performance improvement (no behavioral change).
- `refactor`: code restructuring without behavior change.
- `test`: adding or fixing tests only.
- `docs`: documentation only.
- `build` / `ci`: build system or CI configuration.
- `chore`: tooling, formatting, or other non-code changes.
Platforms Affected
- `WITH_CPU`
- `WITH_NVIDIA`
- `WITH_ILUVATAR`
- `WITH_METAX`
- `WITH_CAMBRICON`
- `WITH_MOORE`
- `WITH_ASCEND`
- `WITH_TORCH`
Test Results on Supported Platforms
All runs were rebased on current `master`, generated PyTorch operator sources before build, installed with `WITH_TORCH=ON`, and ran full verbose pytest as `python3 -m pytest -v` without `tests/`, `--devices`, or `-n`.

| `pytest` Result |
| --- |
| 6303 passed, 11538 skipped |
| 4803 passed, 11520 skipped |
| 5803 passed, 10520 skipped |
| 3081 passed, 12858 skipped |
| 5767 passed, 10574 skipped |
| 4480 passed, 11801 skipped |

Validation details
Representative smoke checks after install confirmed generated PyTorch operator classes and active platform implementations for `InternalSoftmax`, `LinalgDet`, `ClampMax`, and `SpecialPsi` on every supported platform.
The test counts differ from earlier PR-body snapshots because this branch was rebased after splitting generated operator-base files into PR #622 and after switching generation to the locally installed PyTorch `native_functions.yaml`. PyTorch-backed tests are still collected and executed on every platform.
All checks passed on the rebased branch.
Benchmark / Performance Impact
N/A — this PR changes generated API/backend plumbing and tests. The table above records build and test wall time for each platform to support follow-up compile-time optimization work.
Notes for Reviewers
Downstream PR #622 (feat(torch): add generated operator bases) was regenerated from this PR after the public C++ parameter-name fix (`self` remains an ATen schema name internally, while generated public C++ signatures use `input`) and passed full-platform validation with `WITH_TORCH=ON`. Those results are recorded on PR #622 to avoid mixing downstream generated-base changes into this PR's own table.
Existing `src/base/<op>.h` overloads are treated as the public API when present. The generator binds compatible overloads to ATen schema parameters and fills omitted optional/default schema parameters at the ATen call site.
Generated fresh bases now expose supported optional types as `std::optional<...>`. PyTorch-internal optional types without stable InfiniOps representations remain hidden and are forwarded as typed empty optionals.
The generator reads the `native_functions.yaml` packaged with the locally installed PyTorch `torchgen`, so generated op availability follows the PyTorch schema available in the build environment.
The test harness skips only known vendor-kernel crashes/divergences that otherwise terminate the Python process or compare mismatched vendor paths; PyTorch-backed tests are still collected and executed on every platform.
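The overload binding described in these notes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the generator's real code: `SchemaParam`, `public_name`, and `build_call_args` are hypothetical names, and the C++ type strings are placeholders chosen for the example. Given the parameter names of an existing base overload, it produces one C++ argument expression per ATen schema parameter, filling anything the overload omits with a default literal or a typed empty optional.

```python
from typing import List, NamedTuple, Optional, Sequence


class SchemaParam(NamedTuple):
    name: str               # ATen schema name, e.g. "self"
    cpp_type: str           # hypothetical C++ type used at the ATen call site
    is_optional: bool       # True for "T?" schema types
    default: Optional[str]  # C++ literal for a non-None default, else None


def public_name(schema_name: str) -> str:
    """Public C++ signatures use `input` where the ATen schema says `self`."""
    return "input" if schema_name == "self" else schema_name


def build_call_args(
    params: Sequence[SchemaParam], overload_names: Sequence[str]
) -> List[str]:
    """Return one C++ argument expression per schema parameter, in schema order."""
    known = set(overload_names)
    unknown = known - {public_name(p.name) for p in params}
    if unknown:
        # The overload names something the schema lacks: treat as incompatible.
        raise ValueError(f"overload parameters not in schema: {sorted(unknown)}")

    exprs = []
    for p in params:
        name = public_name(p.name)
        if name in known:
            exprs.append(name)  # forwarded from the overload
        elif p.default is not None:
            exprs.append(p.default)  # omitted, schema default literal
        elif p.is_optional:
            exprs.append(f"std::optional<{p.cpp_type}>{{}}")  # typed empty optional
        else:
            raise ValueError(f"required parameter '{name}' missing from overload")
    return exprs
```

For example, binding a hypothetical overload that takes only `input` and `max` against a clamp-like schema yields `["input", "std::optional<at::Scalar>{}", "max"]`. Raising on incompatible overloads mirrors the idea of skipping drifted bases rather than emitting broken code.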
Checklist
Title, Branch, and Commits
- […] `feat(nvidia): …`, `fix(cuda/gemm): …`).
- […] `<type>/xxx-yyyy-zzzz` where `<type>` matches the PR title's Conventional Commits type and words are joined with hyphens (see `CONTRIBUTING.md` §Branches).
- […] `CONTRIBUTING.md` §Pull Requests).
- […] `master`; the branch is rebased cleanly on top of the current `master`.
- […] `fixup!` / `squash!` / `wip` commits remain.
Scope and Design
- […] `CONTRIBUTING.md` §Code/General).
- […] `printf` / `std::cout` / `print(...)` left behind, or `TODO` without an owner and issue link.
General Code Hygiene
- […] `CONTRIBUTING.md` §Code/General).
- […] `CONTRIBUTING.md` §Code/General).
- […] the `seqlens_k` tensor) (`CONTRIBUTING.md` §Code/General).
- […] `CONTRIBUTING.md` §Code/General).
- […] `CONTRIBUTING.md` §Code/General; §Python).
C++ Specific
- `clang-format --dry-run --Werror src/hash.h` passes.
- `clang-tidy` was not run; no kernel or algorithm implementation path is added.
- […] `CONTRIBUTING.md` §C++).
- […] `CONTRIBUTING.md` §C++).
- […] `CONTRIBUTING.md` §C++).
- […] `CONTRIBUTING.md` §C++).
- […] `CONTRIBUTING.md` §C++).
- […] `src/base/<op>.h` or platform implementation directories.
- […] `new` / `delete`; RAII / smart pointers / existing allocators are used.
Python Specific
- `ruff check` passes cleanly.
- `ruff format --check` passes cleanly.
- […] `CONTRIBUTING.md` §Python).
- […] `CONTRIBUTING.md` §Python).
- […] `CONTRIBUTING.md` §Python).
- […] `if`, `for`, and similar control-flow statements (`CONTRIBUTING.md` §Python).
- […] `return`, except when it directly follows a control-flow statement like `if` or `for` (`CONTRIBUTING.md` §Python).
Testing
- […] `WITH_TORCH=ON`.
- […] `tests/`.
- […] `pytest.mark.parametrize` correctly.
- `pytest.mark.auto_act_and_assert` is not used by the generator unit tests or generated torch-op harness touched here.
- […] `dtype` / `device` parameterization is relied on, or overridden with an explicit `pytest.mark.parametrize` when necessary.
Build, CI, and Tooling
- `compile_commands.json` still regenerates through the existing CMake/scikit-build configuration path.
- `CMakeLists.txt` is not changed.
- `ruff` and `clang-format` checks are green.
- […] `pyproject.toml`'s `[project.optional-dependencies]`.
Documentation
Security and Safety